Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations

Abstract

Learning effective joint embedding for cross-modal data has always been a focus in the field of multimodal machine learning. We argue that during multimodal fusion, the generated multimodal embedding may be redundant, and the discriminative unimodal information may be ignored, which often interferes with accurate prediction and leads to a higher risk of overfitting. Moreover, unimodal representations also contain noisy information that negatively influences the learning of cross-modal dynamics. To this end, we introduce the multimodal information bottleneck (MIB), aiming to learn a powerful and sufficient multimodal representation that is free of redundancy and to filter out noisy information in unimodal representations. Specifically, inheriting from the general information bottleneck (IB), MIB aims to learn the minimal sufficient representation for a given task by maximizing the mutual information between the representation and the target and simultaneously constraining the mutual information between the representation and the input data. Different from general IB, our MIB regularizes both the multimodal and unimodal representations, which is a comprehensive and flexible framework that is compatible with any fusion methods. We develop three MIB variants, namely, early-fusion MIB, late-fusion MIB, and complete MIB, to focus on different perspectives of information constraints. Experimental results suggest that the proposed method reaches state-of-the-art performance on the tasks of multimodal sentiment analysis and multimodal emotion recognition across widely used datasets. The codes are available at https://github.com/TmacMai/Multimodal-Information-Bottleneck.
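
For orientation, the general IB objective that the abstract builds on is usually written in the form below. This is the standard formulation from the IB literature, not an equation taken from this page; the symbols $x$ (input data), $y$ (target), $z$ (learned representation), and the trade-off weight $\beta$ are notation assumed here for illustration.

```latex
\max_{p(z \mid x)} \; I(z; y) \;-\; \beta \, I(z; x), \qquad \beta > 0
```

Read against the abstract, the first term rewards sufficiency (the representation stays informative about the target), and the second term penalizes information retained from the input, which pushes the representation toward minimality. How MIB applies such constraints to the unimodal and multimodal representations in each of its three variants is not detailed on this page.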

Related resources

Multimodal Versus Unimodal Instructions

This module provides an overview of multimodal perception, including information Your nose might even be stimulated by the smell of burning rubber or gasoline. In other words, how does the perceptual system determine which unimodal between the two balls that then bounce off each other in opposite directions. Principles and heuristics for designing minimalist instruction. H Van der Multimodal ve...

Learning Stimulus-Location Associations in 8- and 11-Month-Old Infants: Multimodal versus Unimodal Information.

Research on the influence of multimodal information on infants' learning is inconclusive. While one line of research finds that multimodal input has a negative effect on learning, another finds positive effects. The present study aims to shed some new light on this discussion by studying the influence of multimodal information and accompanying stimulus complexity on the learning process. We ass...

Image Pivoting for Learning Multilingual Multimodal Representations

In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding. Our model learns a common representation for images and their descriptions in two different languages (which need not be parallel) by considering the image as a pivot b...

Unimodal & Multimodal Biometric Recognition Techniques A Survey

Biometric recognition refers to the automatic recognition of individuals based on feature vector(s) derived from their physiological and/or behavioral characteristics. Biometric recognition systems should provide reliable personal recognition schemes to either confirm or determine the identity of an individual. These features are used to provide authentication for computer based security s...

Establishing and maintaining perceptual coherence: unimodal and multimodal evidence

How does a listener find and follow the speech of a talker? Many classic and contemporary accounts of speech perception start with a coherent sensory sample of speech already established, as if the perceptual world consisted solely of speech, and as if no more than a single talker ever spoke at once. Long-established and recent characterizations alike have cast the fundamental problem of speech...

Journal

Journal title: IEEE Transactions on Multimedia

Year: 2022

ISSN: 1520-9210, 1941-0077

DOI: https://doi.org/10.1109/tmm.2022.3171679